89 research outputs found

Simulation experiments for similarity indexes between two hierarchical clusterings

Author: MORLINI Isabella
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Morlini and Zani (2012) have proposed a new dissimilarity index for comparing two hierarchical clusterings on the basis of the whole dendrograms. They have presented and discussed its basic properties and have shown that the index can be decomposed into contributions pertaining to each stage of the hierarchies. Then, they have obtained a similarity index S as the complement to one of the suggested distance and have shown that its single components S_k obtained at each stage k of the hierarchies can be related to the measure B_k suggested by Fowlkes & Mallows (1983) and to the Rand index R_k. In this paper, we report results of a series of simulation experiments aimed at comparing the behavior of these new indexes with other well-established similarity measures, over different experimental conditions. The first set of simulations is aimed at determining the behavior of the indexes when the clusterings being compared are unrelated. The second set tries to investigate the robustness to different levels of nois
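As an illustration of the two established measures named in this abstract, the following is a minimal Python sketch of the Rand index R_k and the Fowlkes-Mallows measure B_k, computed by pair counting on two flat partitions such as those obtained by cutting two dendrograms at the same number of clusters k. The function names and the example label vectors are assumptions made for illustration; the sketch does not implement the Morlini and Zani index S or its components S_k, whose formulas are not given in this record.

```python
from itertools import combinations
from math import sqrt


def pair_counts(labels_a, labels_b):
    """Classify every pair of objects by whether each partition joins it.

    Returns (both, only_a, only_b, neither): pairs joined in both
    partitions, only in the first, only in the second, or in neither.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    both = only_a = only_b = neither = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither


def rand_index(labels_a, labels_b):
    """Share of object pairs on which the two partitions agree."""
    both, only_a, only_b, neither = pair_counts(labels_a, labels_b)
    total = both + only_a + only_b + neither
    if total == 0:
        raise ValueError("at least two objects are required")
    return (both + neither) / total


def fowlkes_mallows(labels_a, labels_b):
    """Pairs joined in both partitions, scaled by the geometric mean
    of the numbers of pairs joined in each partition."""
    both, only_a, only_b, _ = pair_counts(labels_a, labels_b)
    denom = sqrt((both + only_a) * (both + only_b))
    return both / denom if denom else 0.0


# Hypothetical cuts of two dendrograms on six objects at k = 3.
cut_a = [0, 0, 1, 1, 2, 2]
cut_b = [0, 0, 1, 2, 2, 2]
print(rand_index(cut_a, cut_b))       # 12 agreeing pairs out of 15
print(fowlkes_mallows(cut_a, cut_b))  # 2 / sqrt(3 * 4)
```

Repeating the computation for each k gives one value of each measure per stage of the two hierarchies, which is the stage-by-stage setting the abstract refers to.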

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

On multicollinearity and concurvity in some nonlinear multivariate models

Author: MORLINI Isabella
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Recent developments of multivariate smoothing methods provide a rich collection of feasible models for nonparametric multivariate data analysis. Among the most interpretable are those with smoothed additive terms. The construction of various methods and algorithms for computing the models has been the main concern in the literature in this area. Fewer results are available on the validation of the computed fit, however, and many applications of nonparametric methods end up in computing and comparing the generalized validation error or related indexes. This article reviews the behavior of some of the best known multivariate nonparametric methods, based on subset selection and on projection, when (exact) collinearity or multicollinearity (near collinearity) is present in the input matrix. It shows the possible aliasing effects in computed fits of some selection methods and explores the properties of the projection spaces reached by projection methods in order to help data analysts select the best model in case of ill-conditioned input matrices. Two simulation studies and a real data set application are presented to illustrate further the effects of collinearity or multicollinearity in the fit

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Radial basis function networks with partially classified data

Author: MORLINI Isabella
Publication date: 01/01/1999

The problem of estimating a classification rule with partially classified observations, which often occurs in biological and ecological modelling, and which is of major interest in pattern recognition, is discussed. Radial basis function networks for classification problems are presented and compared with the discriminant analysis with partially classified data, in situations where some observations in the training set are unclassified. An application on a set of morphometric data obtained from the skulls of 288 specimens of Microtus subterraneus and Microtus multiplex is performed. This example illustrates how the use of both classified and unclassified observations in the estimate of the hidden layer parameters has the potential to greatly improve the network performances

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Variable selection in cluster analysis: an approach based on a new index

Author: MORLINI Isabella
Zani S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

In cluster analysis, the inclusion of unnecessary variables may mask the true group structure. For the selection of the best subset of variables, we suggest the use of two overall indices. The first index is a distance between two hierarchical clusterings and the second one is a similarity index obtained as the complement to one of the previous distance. Both criteria can be used for measuring the similarity between clusterings obtained with different subsets of variables. An application with a real data set regarding the economic welfare of the Italian Regions shows the benefits gained with the suggested procedure

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

New weighed similarity indexes for market segmentation using categorical variables

Author: Morlini Isabella
S. Zani
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

In this paper we introduce new similarity indexes for categorical data with nominal scale. In contrast to traditionally used similarity measures, they also consider the frequency of the modalities of each attribute in the sample. This feature is useful when dealing with rare categories, since it makes sense to differently evaluate the pairwise presence of a rare category from the pairwise presence of a widespread one. We also propose a specific weighted index for dependent categorical variables. The suitability of the proposed measures from a marketing research perspective is shown using two real world data sets

Archivio istituzionale della Ricerca - Università degli Studi di Parma

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Assessing decoding ability: the role of speed and accuracy and a new composite indicator to measure decoding skill in elementary grades

Author: MORLINI Isabella
SCORZA Maristella
STELLA GIACOMO
Publication venue: 'SAGE Publications'
Publication date: 01/01/2015

Tools for assessing decoding skill in students attending elementary grades are of fundamental importance for guaranteeing an early identification of reading disabled students and reducing both the primary negative effects (on learning) and the secondary negative effects (on the development of the personality) of this disability. This article presents results obtained by administering existing standardized tests of reading and a new screening procedure to about 1,500 students in the elementary grades in Italy. It is found that variables measuring speed and accuracy in all administered reading tests are not Gaussian, and therefore the threshold values used for classifying a student as a normal decoder or as an impaired decoder must be estimated on the basis of the empirical distribution of these variables rather than by using the percentiles of the normal distribution. It is also found that the decoding speed and the decoding accuracy can be measured in either a 1-minute procedure or in much longer standardized tests. The screening procedure and the tests administered are found to be equivalent insofar as they carry the same information. Finally, it is found that speed and accuracy act as complementary effects in the measurement of decoding ability. On the basis of this last finding, the study introduces a new composite indicator aimed at determining the student's performance, which combines speed and accuracy in the measurement of decoding ability
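The point about estimating thresholds from the empirical distribution rather than from normal percentiles can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the error counts are invented and right-skewed on purpose, and the 95th percentile is an arbitrary choice, not the cut-off used in the study.

```python
from statistics import NormalDist, mean, quantiles, stdev


def empirical_threshold(values, pct):
    """pct-th percentile taken directly from the observed data."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[pct - 1]


def gaussian_threshold(values, pct):
    """pct-th percentile of a normal curve fitted by mean and std. dev."""
    fitted = NormalDist(mean(values), stdev(values))
    return fitted.inv_cdf(pct / 100)


# Hypothetical reading-error counts (NOT data from the study).
errors = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 9, 15]

print(empirical_threshold(errors, 95))
print(gaussian_threshold(errors, 95))
```

On skewed data like this the two thresholds differ, so the number of students flagged as impaired decoders depends on which rule is applied.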

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Searching for structure in air pollutants concentration measurements

Author: Morlini Isabella
Publication venue: 'Wiley'
Publication date: 01/01/2007

When studying air pollution measurements at different sites in a spatial area, we may search for a typical pattern, common to all curves, describing the underlying air pollution process in a pre-specified period. Another area of interest to support local authorities in air quality management may be the classification of the different sites in homogeneous clusters and the group ranking that follows. Yet, there is variation in both amplitude and dynamics among the air pollutant concentrations measured at the different monitoring stations. Analyzing such measurements, where the basic unit of information is the entire observed process rather than a string of numbers, involves finding the time shifts or the warping functions among curves. The analysis is much more complicated if we consider a multivariate process, that is, vector-valued air pollutant measurements. Following our previous work where an improved dynamic time-warping algorithm has been developed, especially in the multivariate case, and used both for classifying functional data and estimating the structural mean of a sample of curves, we analyzed the measurements of some air pollutants in Emilia Romagna (northern Italy). In addition, for the univariate analyses, we applied the self-modeling warping function approach, which is also convenient for these data. Indeed, this method was found to be model-free and flexible enough to capture very complex and highly non-linear patterns
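For readers unfamiliar with time warping, the following is a minimal Python sketch of the classic univariate dynamic time-warping distance, which aligns two curves by allowing local stretches along the time axis. It is the textbook recursion, not the improved or multivariate algorithm the abstract refers to; the function name and the toy series are assumptions made for illustration.

```python
def dtw_distance(x, y):
    """Classic dynamic time-warping cost between two numeric sequences.

    cost[i][j] holds the cheapest alignment of x[:i] with y[:j], using
    the absolute difference as the local distance between two values.
    """
    if not x or not y:
        raise ValueError("both sequences must be non-empty")
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y stays
                cost[i][j - 1],      # y advances, x stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Two hypothetical curves with the same shape but a local time stretch.
station_a = [1, 2, 3]
station_b = [1, 2, 2, 3]
print(dtw_distance(station_a, station_b))
```

Because the second series only repeats one value, the warping absorbs the difference entirely, which is the sense in which such methods separate variation in dynamics from variation in amplitude.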

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Some experimental results on the role of speed and accuracy of reading in psychometric tests.

Author: MORLINI Isabella
SCORZA Maristella
STELLA GIACOMO
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

According to the Italian Parliament act (n. 170/2010) that recognizes dyslexia as a physical disturbance, of neurobiological origin, dyslexic children in primary school should be recognized early, in order to assess a targeted intervention within the school and to start teaching that respects the difficulties in learning to read, to write and to perform calculations. Screening procedures inside the primary schools, aimed at detecting children with difficulties in reading, are not so common in Italy as in other European countries. Nevertheless, screening procedures are of fundamental importance for guaranteeing an early detection of dyslexic children and reducing both the primary negative effects - on learning - and the secondary negative effects - on the development of the personality - of this disturbance. In this study we analyze the validity, from a statistical point of view, of a screening procedure recently proposed in the psychometric literature (Stella et al., 2011). This procedure is very fast (it is exactly one minute long), simple, cheap and can be administered by teachers without psychometric experience. On the contrary, the currently used tests are much longer and must be administered by skilled teachers. These two major flaws prevent the widespread use of these tests. If the new procedure is found to be reliable, it can be given to each student in primary school and it can also be repeated over time, in order to monitor the children's difficulties. The validity of the procedure and the benchmark with two currently used tests are studied on the basis of the results of a survey on about 1500 students attending primary school

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia